Scaling an RNS Number Using the Core Function
Abstract
This paper introduces a method for extracting the core of a Residue Number System (RNS) number within the RNS, thus affording a new method for scaling RNS numbers. Suppose an RNS comprises a set of co-prime moduli, mi, with ∏mi = M. This paper describes a method for approximately scaling such an RNS number by the product of a subset of the moduli, ∏mj = MJ ≈ √M, with the characteristic that all computations are performed using the original moduli and one other non-maintained short wordlength modulus.

1. Background and Motivation

The Residue Number System (RNS) has great potential for accelerating arithmetic operations, achieved by breaking operands into several smaller residues and operating on the residues independently and in parallel. RNS implementations were studied extensively in the 1970s, particularly for DSP applications [1], and led to Inmos' production of an RNS 2-D convolver chip in 1989 [2]. However, wider take-up of RNS for DSP was limited by a number of fundamental difficulties:

• Conversion to binary representation from RNS is difficult (the inverse operation is simple)
• Direct magnitude comparison and sign determination of RNS numbers is impossible
• Square root operations are not available, and division operations, although available [3], are not practical due to their complexity

These difficulties place major constraints on the possible applications of RNS arithmetic.
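The residue-wise arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the moduli (7, 11, 13) and all function names are assumptions chosen for the example, not values taken from the paper. It shows the asymmetry noted in the list: forward conversion is a simple remainder per channel, while the reverse conversion needs the Chinese Remainder Theorem.

```python
from math import prod

MODULI = (7, 11, 13)   # hypothetical pairwise co-prime moduli
M = prod(MODULI)       # dynamic range of this RNS: 1001


def to_rns(x):
    """Forward conversion: one small residue per modulus."""
    return tuple(x % m for m in MODULI)


def rns_add(a, b):
    """Channel-wise addition; no carries pass between channels."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))


def rns_mul(a, b):
    """Channel-wise multiplication; each channel is independent."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))


def from_rns(r):
    """Reverse conversion via the Chinese Remainder Theorem."""
    total = 0
    for ri, m in zip(r, MODULI):
        m_hat = M // m
        total += ri * m_hat * pow(m_hat, -1, m)  # pow(..., -1, m): modular inverse
    return total % M
```

For instance, `from_rns(rns_mul(to_rns(23), to_rns(41)))` recovers 943, because 23 × 41 stays below the dynamic range M = 1001. Once a product exceeds M the result wraps around, which is why scaling matters.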
Recently, however, DSP chips using RNS have enjoyed something of a renaissance for a variety of reasons:

• They offer high-performance implementations of arithmetic-intensive applications at reduced power supply voltages, important for mobile and wearable computer and communication systems [4]
• They avoid lengthy on-chip interconnects, which now represent the major constraint on the realisation of high-performance digital VLSI circuits [5]
• They afford hardware-efficient complex multipliers ("QRNS multiplication") comprising two independent multiplications instead of four multiplications and two additions [1]
• The component arithmetic operations in an RNS implementation can, without exception, be reduced to short adders and small look-up tables [1]

All the items in the above list are applicable to custom VLSI implementations, and the last two also apply advantageously to FPGA implementations [6,7].

Recent industrial interest in RNS confirms the existence and scale of problems faced in implementing DSP algorithms in digital microelectronic fabrics at high clock rates but with low power consumption. For example, reference [8] describes an FIR filter in RNS designed by Texas Instruments because of its low-power capability, and reference [9] discusses a general-purpose DSP engine developed by Siemens that incorporates an RNS vector processor with a considerably higher data processing bandwidth than its binary counterpart.

The fundamental difficulties with RNS arithmetic listed earlier have been overcome to some extent by recent innovations in RNS theory. For example, the core function has been shown to be advantageous in converting an RNS number to binary [10], and for adding extra moduli to an RNS in order to increase its dynamic range (“base extension”) [11].
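The QRNS multiplication listed among the advantages above can be illustrated with a small sketch. It assumes the usual textbook construction, which the text here does not spell out: a prime modulus m with m ≡ 1 (mod 4), and an element j with j² ≡ −1 (mod m). The modulus 13, the value j = 5, and the function names are hypothetical choices for the example.

```python
M_Q = 13   # hypothetical prime modulus with M_Q % 4 == 1
J = 5      # 5 * 5 = 25 = 2 * 13 - 1, so J * J is congruent to -1 (mod 13)


def to_qrns(a, b):
    """Map the complex residue a + i*b to the pair (a + J*b, a - J*b) mod M_Q."""
    return ((a + J * b) % M_Q, (a - J * b) % M_Q)


def qrns_mul(p, q):
    """Complex multiplication as two independent modular multiplications."""
    return ((p[0] * q[0]) % M_Q, (p[1] * q[1]) % M_Q)


def from_qrns(p):
    """Recover (real, imaginary) parts from the QRNS pair."""
    inv_2 = pow(2, -1, M_Q)
    inv_2j = pow(2 * J, -1, M_Q)
    real = ((p[0] + p[1]) * inv_2) % M_Q
    imag = ((p[0] - p[1]) * inv_2j) % M_Q
    return real, imag
```

Multiplying (a + ib)(c + id) directly needs the four products ac, bd, ad and bc. In the mapped form, each half of the pair is multiplied once, and the two halves never interact.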
The outstanding problem with RNS processing that prevents its wider take-up is reducing an RNS number's wordlength through scaling (that is, dividing) by a constant with low latency and minimal hardware cost. In binary arithmetic, the scaling constant is invariably set to a power of two so that wordlength reduction is achieved simply by truncating (or rounding) a number. There is no equivalent operation in an RNS, with the consequence that the wordlength growth of an accumulated result through a sequence of multiplications, such as is encountered in a multiple-point FFT or in an IIR filter, is very difficult to manage.

A number of algorithms for scaling RNS numbers have been reported, but as yet none operates entirely within an RNS. Early attempts at scaling fell into two categories. The first was scaling by one modulus, whereby the RNS number was adjusted to be divisible by one of the moduli, divided by that modulus in all the other moduli in a single step, and finally base extended back into the “scaling modulus” (e.g. [12]). The second was a truncated conversion to binary (that is, scaling by a power of two) followed by conversion back into RNS representation (e.g. [13]). However, these methods are generally slow and require processing of longer wordlength numbers outside the RNS.

A major advance was made by Shenoy and Kumaresan [14], who devised a novel decomposition of the Chinese Remainder Theorem that enabled scaling by the product of several moduli. However, their scheme was not optimal: it employed an extra modulus outside the RNS, with a similar wordlength to the existing moduli, requiring extra hardware (typically >10%) for its maintenance, and two redundant channels of residue computation were necessary in the scaling algorithm itself. The total hardware count for Shenoy and Kumaresan's RNS scaler operating on k moduli was k⋅(k+4) modulo arithmetic multiply-accumulates (MACs).
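The first of the early approaches described above, scaling by one modulus, can be sketched as follows. The moduli and function names are hypothetical. The base extension step here is a stand-in that reconstructs the scaled value with the Chinese Remainder Theorem outside the RNS channels; it is not the base extension technique of reference [12], and it shows the kind of longer wordlength processing that these early methods required.

```python
from math import prod

MODULI = (7, 11, 13)  # hypothetical pairwise co-prime moduli, M = 1001


def crt(residues, moduli):
    """Reconstruct the integer in [0, prod(moduli)) with the given residues."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_hat = big_m // m
        total += r * m_hat * pow(m_hat, -1, m)
    return total % big_m


def scale_by_one_modulus(res, s=0):
    """Return the residues of floor(x / MODULI[s]) given the residues of x."""
    m_s, x_s = MODULI[s], res[s]
    others = [i for i in range(len(MODULI)) if i != s]
    scaled = {}
    for i in others:
        m_i = MODULI[i]
        # Subtract x mod m_s so the number becomes divisible by m_s, then
        # divide by multiplying with the inverse of m_s in this channel.
        scaled[i] = ((res[i] - x_s) * pow(m_s, -1, m_i)) % m_i
    # Stand-in base extension: rebuild the scaled value from the remaining
    # channels and reduce it into the scaling modulus.
    y = crt([scaled[i] for i in others], [MODULI[i] for i in others])
    scaled[s] = y % m_s
    return tuple(scaled[i] for i in range(len(MODULI)))
```

With x = 100, whose residues are (2, 1, 9), `scale_by_one_modulus((2, 1, 9))` yields (0, 3, 1), the residues of 14 = ⌊100/7⌋. The division itself is a single parallel step; the cost lies in restoring the residue for the scaling modulus.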
Recent work has concentrated on removing the extra modulus in Shenoy and Kumaresan's scheme at the expense of increasing the logical depth of the scaler [15], or of reducing the accuracy of the scaler by limited use of binary arithmetic outside the RNS channels [16].

This paper introduces a novel technique for scaling an RNS number, based on the core function. The method consists simply of extracting the core of the RNS number within the RNS. All computations reduce to inner products within the moduli of the RNS, with one extra inner product using a modulus outside the RNS but not requiring maintenance of the corresponding residue.

The paper is structured as follows: first some preliminaries regarding the core function are dealt with; then, the proposed scaling algorithm is introduced along with an example; next, difficulties with the proposed algorithm are identified and a workaround described; finally, the paper concludes with a brief discussion of possible further avenues of research.

2 Scaling an RNS number using the Core Function

2.1 The Core Function

The core function is defined for an integer, n, as:
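The defining equation does not appear in this copy of the text. As a hedged illustration only, the sketch below assumes the form usually given in the core function literature, C(n) = Σ wi⋅⌊n/mi⌋ with fixed integer weights wi, which may differ in notation from the paper's own definition. The moduli, the weights, and the function names are hypothetical. The sketch also checks the identity M⋅C(n) = n⋅C(M) − Σ wi⋅ni⋅(M/mi), where ni = n mod mi, which follows from this assumed definition by writing ⌊n/mi⌋ = (n − ni)/mi.

```python
from math import prod

MODULI = (3, 5, 7)     # hypothetical pairwise co-prime moduli
WEIGHTS = (1, -1, 2)   # hypothetical integer weights (assumption, may be negative)
M = prod(MODULI)       # 105


def core(n):
    """Assumed definition: C(n) = sum of w_i * floor(n / m_i)."""
    return sum(w * (n // m) for w, m in zip(WEIGHTS, MODULI))


C_M = core(M)          # equals sum of w_i * (M / m_i)


def core_from_residues(n):
    """Same value via M*C(n) = n*C(M) - sum of w_i * (n mod m_i) * (M / m_i)."""
    correction = sum(w * (n % m) * (M // m) for w, m in zip(WEIGHTS, MODULI))
    numerator = n * C_M - correction
    assert numerator % M == 0  # the identity guarantees exact divisibility
    return numerator // M
```

Under this assumed definition the identity gives C(n) = n⋅C(M)/M minus a bounded correction term, so the core tracks n scaled by the constant C(M)/M. This is consistent with the abstract's description of extracting the core as a means of approximate scaling.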